Improving text clustering for functional analysis of genes

Authors

  • Jing Ding
  • Srinivas Aluru
  • Julie Dickerson
  • Sigurdur Olafsson
Abstract

Chapter 1 Literature Review and Requirements Analysis
  1.1 Functional analysis of microarray data
  1.2 Gene Ontology-based functional analysis
  1.3 Literature-based functional analysis
    1.3.1 Assuming similar expressions imply same functional pathway
    1.3.2 Not assuming similar expressions imply same functional pathway
  1.4 Hybrid systems
  1.5 Requirements analysis
Chapter 2 Strategic Design
  2.1 Design overview
  2.2 Two-step vs. one-step clustering design
  2.3 Document representation
  2.4 Choice of text clustering algorithm
Chapter 3 BOW-Based System: GeneNarrator 1
  3.1 Architectural overview of GeneNarrator 1
  3.2 Detailed description of individual modules
    3.2.1 DocBuilder
    3.2.2 LongBOW
    3.2.3 CrossBOW
    3.2.4 ArrowSmith
    3.2.5 GeneSmith
    3.2.6 BOWviewer
Chapter 4 Evaluation of GeneNarrator 1
  4.1 The gold standard gene list and document set
  4.2 Evaluating clustering: literature review
    4.2.1 Subjective judgment
    4.2.2 Cluster quality measures

Similar resources

Improving Text Clustering for Functional Analysis of Genes Computer Engineering and Bioinformatics and Computation Biology

Continued rapid advancements in genomic, proteomic and metabolomic technologies demand computer-aided methods and tools to process large amounts of data efficiently and in a timely manner, extract meaningful information, and interpret data into knowledge. While numerous algorithms and systems have been developed for information extraction (i.e. profiling analysis), biological interpretation still largely re...

Document clustering based on ontology and a fuzzy approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, the unsupervised classification of similar documents into different groups, is one of the important techniques of text mining. The most important step...

Protein-Protein Interaction Analysis of Common Top Genes in Obsessive-Compulsive disorder (OCD) and Schizophrenia: Towards New Drug Approach

Comorbidity is common among psychiatric disorders, including obsessive-compulsive disorder and schizophrenia. Many studies have suggested that the disorders may share the same etiological bases. In this regard, the shared glutamate, dopaminergic, and serotonin pathways are the known ones. Here, the common significant genes are examined to understand the possible molecular origin of the diso...

A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...

Publication date: 2006